Building Ontologies to Understand Spoken Tunisian Dialect

Authors

  • Marwa Graja
  • Maher Jaoua
  • Lamia Hadrich Belguith
Abstract

This paper presents a method for understanding spoken Tunisian dialect based on lexical semantics. The method takes into account the specificity of the Tunisian dialect, for which no linguistic processing tools exist. It is ontology-based: ontological concepts are used for semantic annotation, and ontological relations are used for speech interpretation. This combination increases the rate of comprehension and limits the dependence on linguistic resources. The paper also details the process of building the ontology used for annotation and interpretation of Tunisian dialect in the context of speech understanding in dialogue systems for a restricted domain.
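As a rough illustration of the two roles the abstract assigns to the ontology (concepts for annotation, relations for interpretation), here is a minimal Python sketch. It is not the authors' implementation. The lexicon entries, concept names, relation names, and the functions `annotate` and `interpret` are all hypothetical, and the tokens are English placeholders rather than real Tunisian dialect vocabulary.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: word -> ontology concept.
# Placeholder tokens only; not real Tunisian dialect vocabulary.
LEXICON: Dict[str, str] = {
    "train": "Train",
    "ticket": "Ticket",
    "tunis": "City",
    "sfax": "City",
    "to": "DestinationMarker",
}

# Hypothetical ontological relations:
# (source concept, target concept) -> relation name.
RELATIONS: Dict[Tuple[str, str], str] = {
    ("Ticket", "Train"): "ticket_for",
    ("DestinationMarker", "City"): "has_destination",
}


def annotate(utterance: str) -> List[Tuple[str, str]]:
    """Label each known word with its concept; unknown words are skipped."""
    words = utterance.lower().split()
    return [(w, LEXICON[w]) for w in words if w in LEXICON]


def interpret(annotations: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Link adjacent annotated words whose concepts are related in the ontology."""
    links = []
    for (w1, c1), (w2, c2) in zip(annotations, annotations[1:]):
        relation = RELATIONS.get((c1, c2))
        if relation is not None:
            links.append((relation, w1, w2))
    return links


if __name__ == "__main__":
    labelled = annotate("a ticket train to sfax")
    print(labelled)
    print(interpret(labelled))
```

Because unknown words are simply skipped during annotation, the sketch needs no tokenizer, tagger, or parser beyond whitespace splitting, which is one way to read the abstract's claim about limited dependence on linguistic resources.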


Similar resources

Semi-automatic Domain Ontology Construction from Spoken Corpus in Tunisian Dialect: Railway Request Information

In this paper, we present a hybrid method for the semi-automatic building of a domain ontology from a spoken dialogue corpus in Tunisian Dialect for the railway request information domain. The proposed method is based on a statistical method for term and concept extraction and a linguistic method for semantic relation extraction. This method consists of three fundamental phases, namely the corpus const...


Mapping Rules for Building a Tunisian Dialect Lexicon and Generating Corpora

Nowadays in Tunisia, the Arabic Tunisian Dialect (TD) is increasingly used in interviews, news and debate programs instead of Modern Standard Arabic (MSA). This has given rise to a new kind of language: the majority of speech is no longer in MSA but alternates between MSA and TD. This situation has important negative consequences on Automatic Speech Recognition (ASR): si...


Building bilingual lexicon to create Dialect Tunisian corpora and adapt language model

Since the Tunisian revolution, Tunisian Dialect (TD), used in daily life, has become progressively used and represented in interviews, news and debate programs instead of Modern Standard Arabic (MSA). This situation has important negative consequences for natural language processing (NLP): since the spoken dialects are not officially written and do not have standard orthography, it is very costl...


Tunisian dialect Wordnet creation and enrichment using web resources and other Wordnets

In this paper, we propose TunDiaWN (Tunisian dialect Wordnet), a lexical resource for the dialect spoken in Tunisia. Our TunDiaWN construction approach is founded, on the one hand, on a corpus-based method to analyze and extract Tunisian dialect words. A clustering technique is adapted and applied to mine the possible relations existing between the Tunisian dialect extracted words and to gr...


A Conventional Orthography for Tunisian Arabic

Tunisian Arabic is a dialect of the Arabic language spoken in Tunisia. Tunisian Arabic is an under-resourced language. It has neither a standard orthography nor large collections of written text and dictionaries. Actually, there is no strict separation between Modern Standard Arabic, the official language of the government, media and education, and Tunisian Arabic; the two exist on a continuum ...



Journal title:
  • CoRR

Volume abs/1109.0624  Issue 

Pages  -

Publication date 2011